AITopics | adder neural network

Supplementary Materials: An Empirical Study of Adder Neural Networks for Object Detection

Neural Information Processing Systems, Apr-25-2026, 11:37:54 GMT

As discussed in prior literature [1, 4], a single floating-point addition and a single floating-point multiplication cost 0.9 pJ and 3.7 pJ, respectively. By comparison, an 8-bit integer addition and multiplication cost 0.03 pJ and 0.2 pJ, far less than their floating-point counterparts. It is therefore important to explore whether adder detectors perform well under INT8 quantization. We applied INT8 post-training quantization to our Adder FCOS (B+N) model, which incurs a 0.8 mAP drop compared with the full-precision model, as shown in Table A. The energy reduction increases further, from 29% to 35%. Note that post-training quantization is not optimal for INT8 models, and quantization-aware training may further improve the accuracy.
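The per-operation costs quoted above can be turned into a rough per-layer energy estimate. The following is a minimal sketch, not the authors' accounting: the function names and the operation count are hypothetical, and it assumes that a convolution spends one multiplication and one addition per weight, while an adder layer spends two additions per weight (one subtraction, counted as an addition, plus one accumulation).

```python
# Per-operation energy in picojoules, taken from the figures quoted in the text.
ENERGY_PJ = {
    ("float", "add"): 0.9,
    ("float", "mul"): 3.7,
    ("int8", "add"): 0.03,
    ("int8", "mul"): 0.2,
}


def layer_energy_pj(num_adds, num_muls, precision):
    """Total energy (pJ) for a given number of additions and multiplications."""
    return (num_adds * ENERGY_PJ[(precision, "add")]
            + num_muls * ENERGY_PJ[(precision, "mul")])


def conv_layer_energy_pj(num_weights_applied, precision):
    """Assumption: a convolution costs one multiply and one add per applied weight."""
    return layer_energy_pj(num_weights_applied, num_weights_applied, precision)


def adder_layer_energy_pj(num_weights_applied, precision):
    """Assumption: an adder layer costs two additions per applied weight."""
    return layer_energy_pj(2 * num_weights_applied, 0, precision)


if __name__ == "__main__":
    ops = 1_000_000  # hypothetical operation count for a single layer
    for precision in ("float", "int8"):
        conv = conv_layer_energy_pj(ops, precision)
        adder = adder_layer_energy_pj(ops, precision)
        print(f"{precision}: conv = {conv:.0f} pJ, adder = {adder:.0f} pJ")
```

This per-layer comparison does not reproduce the detector-level figures of 29% and 35%, which depend on which layers of the detector are actually replaced by adder layers.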

artificial intelligence, detector, machine learning, (14 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.41)

Kernel Based Progressive Distillation for Adder Neural Networks

Neural Information Processing Systems, Dec-24-2025, 07:09:43 GMT

Adder Neural Networks (ANNs), which contain only additions, offer a new way of developing deep neural networks with low energy consumption. Unfortunately, accuracy drops when all convolution filters are replaced by adder filters. The main reason is the optimization difficulty of ANNs using the $\ell_1$-norm, in which the gradient estimate in back propagation is inaccurate. In this paper, we present a novel method for further improving the performance of ANNs without increasing the trainable parameters via a progressive kernel based knowledge distillation (PKKD) method. A convolutional neural network (CNN) with the same architecture is simultaneously initialized and trained as a teacher network, and the features and weights of the ANN and CNN are transformed to a new space to eliminate the accuracy drop. The similarity is measured in a higher-dimensional space to disentangle the difference of their distributions using a kernel based method. Finally, the desired ANN is learned progressively, based on information from both the ground truth and the teacher. The effectiveness of the proposed method for learning ANNs with higher performance is verified on several benchmarks. For instance, the ANN-50 trained using the proposed PKKD method obtains 76.8% top-1 accuracy on the ImageNet dataset, which is 0.6% higher than that of ResNet-50.
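To make the difference between a convolution filter and an adder filter concrete, here is a minimal sketch of the two responses on a single flattened patch. It assumes the usual AdderNet formulation, in which the adder response is the negative $\ell_1$ distance between the patch and the filter; the abstract itself only mentions the $\ell_1$-norm, and the function names and sample values are illustrative.

```python
def conv_response(patch, kernel):
    """Convolution (cross-correlation) response: sum of elementwise products."""
    return sum(x * f for x, f in zip(patch, kernel))


def adder_response(patch, kernel):
    """Adder response (assumed form): negative l1 distance, additions only."""
    return -sum(abs(x - f) for x, f in zip(patch, kernel))


if __name__ == "__main__":
    patch = [1.0, 2.0, 3.0, 4.0]    # illustrative flattened input patch
    kernel = [1.0, 0.0, -1.0, 2.0]  # illustrative flattened filter
    print(conv_response(patch, kernel))   # 6.0
    print(adder_response(patch, kernel))  # -8.0
```

Under this assumed form the adder response is never positive and reaches zero only when the patch equals the filter, so its output distribution differs from that of a convolution with the same weights.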

adder neural network, name change, progressive distillation, (5 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.59)

An Empirical Study of Adder Neural Networks for Object Detection

Neural Information Processing Systems, Dec-24-2025, 00:06:31 GMT

Adder neural networks (AdderNets) have shown impressive performance on image classification with only addition operations, which are more energy efficient than traditional convolutional neural networks built with multiplications. Compared with classification, there is a strong demand for reducing the energy consumption of modern object detectors via AdderNets in real-world applications such as autonomous driving and face detection. In this paper, we present an empirical study of AdderNets for object detection. We first reveal that the batch normalization statistics in the pre-trained adder backbone should not be frozen, because of the relatively large feature variance of AdderNets. Moreover, we insert more shortcut connections in the neck part and design a new feature fusion architecture to avoid the sparse features of adder layers. We present extensive ablation studies to explore several design choices of adder detectors. Comparisons with state-of-the-art methods are conducted on the COCO and PASCAL VOC benchmarks. Specifically, the proposed Adder FCOS achieves 37.8% AP on the COCO val set, demonstrating performance comparable to that of its convolutional counterpart with roughly a $1.4\times$ energy reduction.

adder neural network, empirical study, name change, (7 more...)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.87)

Supplementary Material: Progressive Kernel Based Knowledge Distillation for Adder Neural Networks

Neural Information Processing Systems, Aug-15-2025, 03:07:27 GMT

Thus, Eq. (7) in the main paper can be written as: [...] Thus, the transformation in Eq. (7) in the main paper can be expressed as a linear combination of [...]

In this section, more experimental results of PKKD are presented. [...] AT [5] on ResNet-20 using the CIFAR-10 dataset, as shown in Tab. 1.

Table 1: Comparison with other methods on ResNet-20 using the CIFAR-10 dataset.

Method             Accuracy
PKKD               92.96%
ANN + dropout      92.20%
Snapshot-KD [3]    92.33%
SP-KD [2]          92.38%
Gift-KD [4]        92.22%
AT [5]             92.27%

Then, we show the superiority of the proposed methods on traditional CNN distillation. The results are shown in Tab. 2.

Table 2: PKKD and KD in CNN distillation. NP' stands for using progressive or fixed teacher.

knowledge distillation, progressive kernel, supplementary material, (7 more...)

Country: North America > Canada (0.05)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.54)

Review for NeurIPS paper: Kernel Based Progressive Distillation for Adder Neural Networks

Neural Information Processing Systems, Jan-26-2025, 13:44:52 GMT

Weaknesses: The effectiveness of the kernel method, one of the claimed contributions, is not fully justified. As shown in Table 1, the kernel operation brings insignificant gain on CIFAR-10 with the shallower ResNet-20 network. The gains (below 0.21%) seem insignificant and may be due to the stochastic initialization of networks, suggesting that the proposed kernel scheme may not be as effective as advocated. I advise that a comparison on ImageNet with a deeper network (e.g., ResNet-50) be performed. The current experiments are not strong enough to support the claim that the proposed method is a competitive knowledge distillation method.

adder neural network, distillation, progressive distillation, (12 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.42)

Review for NeurIPS paper: Kernel Based Progressive Distillation for Adder Neural Networks

Neural Information Processing Systems, Jan-26-2025, 13:44:46 GMT

I believe that, by bridging the gap between Adder NNs and CNNs, this work provides a considerable contribution, allowing Adder NNs to be considered among practical architectures and encouraging the community to research them further. In accordance with the reviewers, I think the proposed method is thoroughly investigated empirically. Please make sure to update the paper with all the results and answers that you provided in your rebuttal.

adder neural network, neurips paper, progressive distillation, (2 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.40)
